Are large language models worth it?
Large language models (LLMs) may be transformative but pose serious risks and are already causing harm, prompting the author to question whether they are "worth it." The author, Nicholas Carlini, work…
Nicholas Carlini argues that advanced AI systems pose significant societal risks precisely because they are designed for "ruthless efficiency." He outlines a spectrum of concerns, from current harms l…
In a 2025 blog post, researcher Nicholas Carlini argues that the future of large language models is highly uncertain, with two plausible but opposing outcomes. He suggests that within three to five ye…
In his article, Nicholas Carlini explains that his research on "memorization" in machine learning models, which demonstrates that models can sometimes output verbatim training data, is often cited in …
Nicholas Carlini announced he is leaving Google DeepMind after seven years to join Anthropic for one year, citing disagreements with DeepMind leadership over its support for high-impact security and p…
In a 2025 article, Nicholas Carlini asked readers to make 30 forecasts about AI in 2027 and 2030, requiring them to give 90% confidence intervals rather than point estimates. Analyzing the responses, …
Nicholas Carlini describes a project where he uses a different large language model (LLM) each day for twelve days to completely rewrite his personal website homepage and bio. He prompts each model to…
Many people hold overly confident but vague predictions about AI's future, which are often proven wrong. To address this, author Nicholas Carlini presents a set of about 30 specific, refutable questio…
According to Nicholas Carlini's 2023 article, while computers have surpassed humans in chess for decades using specialized game-playing models, OpenAI's GPT-3.5-turbo-instruct—a language model designe…
A command injection vulnerability in GPT-4 discovered by Nicholas Carlini, where the model's use of the `<|endoftext|>` token can cause it to abruptly end a response and begin generating unrelated con…